REMARKS 



Applicants submit that the amendments to the 
specification, including to the Brief Description of the Drawings, only 
amend the figure identifiers to correspond to the identifiers on the 
drawings submitted with the application; i.e., Figures 1, 2, 3, 4, 5, 7, 
10, 13, 14, 15, and 21 are amended to Figures 1A/1B, 
2A/2B/2C/2D/2E/2F/2G/2H/2I/2J, 3A/3B/3C/3D/3E/3F/3G/3H/3I, 
4A/4B, 5A/5B/5C/5D, 7A/7B/7C, 10A/10B/10C,
13A/13B/13C/13D/13E/13F/13G/13H/13I/13J/13K/13L/13M/
13N/13O/13P/13Q/13R/13S/13T/13U/13V/13W/13X/13Y/13Z,
14A/14B, 15A/15B/15C, 21A/21B/21C, and contain no new matter. 

Attached hereto is a marked-up version of the changes
made to the specification and claims by the current amendment. The
attached page is captioned "Version with markings to show
changes made."

Applicants further believe no fees are due; however, if
this is in error, please debit Deposit Account No. 07-1185 on which 
the undersigned is allowed to draw. 

Respectfully submitted, 

Benjamin Aaron Adler, Ph.D., J.D.
Counsel for Applicant 
Registration No. 35,423 

VERSION WITH MARKINGS TO SHOW CHANGES MADE



IN THE SPECIFICATION:

Paragraph beginning at line 2 of page 7 has been amended as 
follows: 

Figures 1A and 1B. Fig. 1. Phylogenetic relationship of the
11 viral genomes described in this patent application (highlighted) to
representatives of all major HIV-1 (group M) subtypes in gag (Figure
1A) and env (Figure 1B) regions. Trees were constructed from full-
length gag and env nucleotide sequences using the neighbor joining
method (see text for details of methodology). Horizontal branch 
lengths are drawn to scale; vertical separation is for clarity only. 
Values at the nodes indicate the percent bootstraps in which the 
cluster to the right was supported (bootstrap values of 75% and 
higher are shown). Asterisks denote hybrid genomes as determined 
by additional analyses. Brackets at the right represent the major 
sequence subtypes of HIV-1 group M. Trees were rooted by using 
SIVcpzGAB as an outgroup. 

Paragraph beginning at line 12 of page 7 has been amended as 
follows: 

Figures 2A-2J. Fig. 2. Diversity plots comparing the 
sequence relationships of the 11 viral genomes described in this 
patent application to each other and to reference sequences from the 
database. In each of Figures 2A-2J panels A-J, the sequence named
above the plots is compared to the sequences listed at the right.
U455, LAI, C2220, and NDK are published reference sequences for
subtypes A, B, C and D, respectively. Distance values were calculated 
for a window of 500 bp moved in steps of 10 nucleotides. The x-axis 
indicates the nucleotide positions along the alignment (gaps were 
stripped and removed from the alignment). The positions of the 
start codons of the gag, pol, vif, vpr, env, and nef genes are shown. 
The y-axis denotes the distance between the viruses compared (0.05
= 5% divergence).
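
The windowing described above can be sketched roughly as follows. This is an illustrative sketch only: the distance measure is assumed here to be a simple proportion of differing sites, which the text does not specify, and the function names and toy sequences are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two equal-length aligned sequences.

    Assumption: a plain proportion-of-differences distance; the original
    analysis may have used a corrected distance instead.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)


def diversity_plot(query, reference, window=500, step=10):
    """Return (window_start, distance) pairs for a window slid along the alignment.

    The defaults mirror the 500 bp window and 10-nucleotide step given in the
    text; both sequences are assumed to be gap-stripped and aligned.
    """
    points = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        points.append((start, p_distance(query[start:end], reference[start:end])))
    return points


if __name__ == "__main__":
    # Toy sequences (hypothetical): identical in the first half, different in the second.
    query = "A" * 1000
    reference = "A" * 500 + "C" * 500
    for start, dist in diversity_plot(query, reference)[::10]:
        print(start, round(dist, 2))
```

On such a plot a value of 0.05 on the y-axis corresponds to 5% divergence within the window, as stated above.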

Paragraph beginning at line 22 of page 7 has been amended as 
follows: 

Figures 3A-3I. Fig. 3. Exploratory tree analysis. 
Neighbor joining trees were constructed for a 500 bp window moved 
in increments of 100 bp along the multiple genome alignment. Trees 
depicting discordant branching orders among four of the 11 
sequences included in this patent application are shown in Figures 
3A-3I panels A-I (hybrid sequences are boxed). The position of each 
tree in the alignment is indicated; subtypes are identified by 
brackets. Numbers at nodes indicate the percentage of bootstrap 
values with which the adjacent cluster is supported (only values 
above 80% are shown). Branch lengths are drawn to scale. 

Paragraph beginning at line 30 of page 7 has been amended as 
follows: 

Figures 4A and 4B. Fig. 4. Recombination breakpoint
analysis for 92RW009.6 and 93BR029.4. (Figure 4A) Bootstrap plots
depicting the relationship of 92RW009.6 to representatives of
subtypes A and C, respectively. Trees were constructed from the
multiple genome alignment and the magnitude of the bootstrap value
supporting the clustering of 92RW009.6 with U455 and 92UG037.1 
(subtype A), or C2220 and 92BR025.8 (subtype C), respectively, was 
plotted for a window of 500 bp moved in increments of 10 bp along 
the alignment. Regions of subtype A or C origin are identified by 
very high bootstrap values (>90%). Points of cross-over of the two 
curves indicate recombination breakpoints. The beginning of gag, 
pol, vif, vpr, env and nef open reading frames are shown. The y-axis 
indicates the percent bootstrap replicates, which support the 
clustering of 92RW009.6 with representatives of the respective 
subtypes. (Figure 4B) Bootstrap plots depicting the relationship of
93BR029.4 to representatives of subtypes B and F, respectively.
Analyses are as in (Figure 4A), except that bootstrap values
supporting the clustering of 93BR029.4 with SF2, OYI, MN, LAI and 
RF (subtype B), or 93BR020.1 (subtype F), respectively, were plotted. 
Subtype D viruses were excluded from this analysis because of their 
known close relationship with subtype B viruses. 
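
As a hedged illustration of how cross-over points of two bootstrap curves might be located, the sketch below scans two series of bootstrap support values and reports the window positions between which the better-supported subtype changes. The function name, the input layout, and the toy values are assumptions made for illustration and are not taken from the application.

```python
def find_crossovers(positions, support_a, support_c):
    """Return (left_pos, right_pos) pairs between which the two curves cross.

    positions  -- window positions along the alignment
    support_a  -- percent bootstrap support for clustering with one subtype
    support_c  -- percent bootstrap support for clustering with the other
    Windows where the two values are equal are skipped, so a crossing is
    reported between the nearest windows with a clear difference.
    """
    crossings = []
    prev_sign = 0
    prev_pos = None
    for pos, a, c in zip(positions, support_a, support_c):
        diff = a - c
        sign = (diff > 0) - (diff < 0)
        if sign == 0:
            continue
        if prev_sign != 0 and sign != prev_sign:
            crossings.append((prev_pos, pos))
        prev_sign = sign
        prev_pos = pos
    return crossings


if __name__ == "__main__":
    # Toy values (hypothetical): support shifts from the first subtype to the second.
    positions = [0, 10, 20, 30, 40]
    support_a = [95, 90, 50, 10, 5]
    support_c = [5, 10, 50, 90, 95]
    print(find_crossovers(positions, support_a, support_c))
```

A reported interval only brackets a candidate breakpoint; the window size still limits how precisely it can be placed.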

Paragraph beginning at line 16 of page 8 has been amended as 
follows: 

Figures 5A-5D. Fig. 5. Recombination breakpoint
analysis of 92NG083.2 and 92NG003.1. Neighbor joining trees
depicting discordant branching orders of 92NG003.1 and 92NG083.2
in regions delineated by breakpoints identified by distance plots (not
shown) are shown in Figures 5A-5D panels A-D (hybrid sequences
are boxed). The position of each tree in the alignment is indicated; 
subtypes are identified by brackets. Numbers at nodes indicate the 
percentage of bootstrap values with which the adjacent cluster is
supported (only values above 80% are shown). Branch lengths are
drawn to scale. 

Paragraph beginning at line 27 of page 8 has been amended as 
follows: 

Figures 7A-7C. Fig. 7. Subtype specific genome features. 
(Figure 7A) Alignment of deduced Tat (region encoded by second
exon) amino acid sequences. Consensus sequences were generated 
for available representatives of all major subtypes (question marks 
indicate sites at which fewer than 50% of the viruses contain the 
same amino acid residue). Dashes denote sequence identity with the 
consensus sequence, while dots represent gaps introduced to 
optimize alignments. A vertical box highlights a premature Tat 
protein truncation (asterisk) which is present in 11 of 15 subtype D, 
and 4 of 52 subtype B viruses (frequencies are listed in the column 
on the right). (Figure 7B) Alignment of deduced Rev (region encoded
by the second exon) protein sequences. (Figure 7C) Alignment of
deduced Vpu protein sequences. 

Paragraph beginning at line 19 of page 9 has been amended as 
follows: 

Figures 10A-10K. Fig. 10. Exploratory tree analysis. 
Neighbor joining trees were constructed for a 400 bp window moved 
in increments of 10 bp along the multiple genome alignment. Trees 
Figures 10A-10K panel A-K depict the discordant branching orders
for 94CY032.3 (highlighted). The position of each tree in the
alignment is indicated; subtypes are identified by brackets. Numbers
at nodes indicate the percentage of bootstrap values with which the
adjacent cluster is supported (only values above 80% are shown).
Branch lengths are drawn to scale. 

Paragraph beginning at line 16 of page 10 has been amended as
follows: 

Figures 13A-13Z. Fig. 13. Nucleotide sequences of 
alignment of the 11 near full-length HIV-1 sequences included in
this patent application. Sequences were aligned using CLUSTAL W 
and adjusted manually using the sequence editor MASE. Dots 
indicate gaps introduced to optimize the alignment. The beginning 
and end of all open reading frames are indicated by arrows above or 
below the alignment. The homologies between the sequences of 
nucleotides in the eleven independent clones are indicated by 
dashes. 

Paragraph beginning at line 25 of page 10 has been amended as 
follows: 

Figures 14A and 14B. Fig. 14. Amino acid sequence
alignments of the Gag polypeptides encoded by the 11 near full- 
length HIV-1 sequences included in this patent application. The 
homologies between the sequences of amino acids in the various 
polypeptides encoded by the eleven independent clones are 
indicated by dashes. Sequences of amino acids present uniquely in 
the various polypeptides (as compared to the corresponding 
polypeptides of the other ten clones) are indicated by letters, i.e., the 
sequences themselves. 

Paragraph beginning at line 1 of page 11 has been amended as
follows: 

Figures 15A-15C. Fig. 15. Amino acid sequence 
alignments of the Pol polypeptides encoded by the 11 near full- 
length HIV-1 sequences included in this patent application. The 
homologies between the sequences of amino acids in the various 
polypeptides encoded by the eleven independent clones are 
indicated by dashes. Sequences of amino acids present uniquely in 
the various polypeptides (as compared to the corresponding 
polypeptides of the other ten clones) are indicated by letters, i.e., the 
sequences themselves. 

Paragraph beginning at line 12 of page 12 has been amended as 
follows: 

Figures 21A-21C. Fig. 21. Amino acid sequences of 
alignments of the Env polypeptides encoded by the 11 near full- 
length HIV-1 sequences included in this patent application. The 
homologies between the sequences of amino acids in the various 
polypeptides encoded by the eleven independent clones are 
indicated by dashes. Sequences of amino acids present uniquely in 
the various polypeptides (as compared to the corresponding 
polypeptides of the other ten clones) are indicated by letters, i.e., the 
sequences themselves. 

Paragraph beginning at line 27 of page 12 has been amended as 
follows: 

The present invention relates to the determination of the 
nucleic acid sequences of the complete or near complete genomes of 
11 non-subtype B HIV-1 viruses isolated from primary isolates
collected at major epicenters of the global AIDS pandemic. The
nucleotide sequences of these 11 viruses are shown in Figures 13A-
13Z Fig. 13 (SEQ ID NOS: _ to _).

Paragraph beginning at line 26 of page 14 has been amended as 
follows: 

The present invention relates to nucleic acids having the
genomic sequence of any one of the 11 molecular clones for non-
subtype HIV-1 isolates of this invention as shown in Figures 13A-
13Z Fig. 13 (SEQ ID NOS: _ to _), as well as fragments (or partial
sequences) thereof. The invention also relates to nucleic acids having
complementary (or antisense) sequences to the sequences shown in
Figures 13A-13Z Fig. 13 (SEQ ID NOS: _ to _), as well as fragments
(or partial sequences) thereof. Partial sequences may be obtained by
various methods, including restriction digestion of nucleic acids with
sequences shown in Figures 13A-13Z Fig. 13 (SEQ ID NOS: _ to _),
PCR amplification, and direct synthesis. Partial sequences may be all
or part of the LTR and/or other untranslated regions of the genomes
of one or more of the 11 viral clones of this invention, and/or all or
part of the genes encoding the Gag, Pol, Vif, Vpr, Env, Tat, Rev, Nef
and Vpu proteins and/or complementary (or antisense) sequences
thereof. Nucleic acids of the invention also include cDNA, mRNA, and
other nucleic acids derived from the genomic sequences of one or
more of these 11 HIV-1 clones. Sequences of the genes encoding Gag,
Pol, Vif, Vpr, Env, Tat, Rev, Nef and Vpu are identified in Figures
13A-13Z Fig. 13.

Paragraph beginning at line 18 of page 16 has been amended as 
follows: 

The nucleic acid probes used in the detection methods set 
forth above are derived from nucleic acid sequences shown in 
Figures 13A-13Z Fig. 13 (SEQ ID NOS: _ to __). The size of such
probes is at least 10-12 bases long, more usually at least about 19 
bases long, more usually from about 200 to 500 bases, and often 
exceeding 10000 bases. 

Paragraph beginning at line 7, of page 17, has been amended as 
follows: 

The nucleic acid probes used in the detection methods set
forth above are derived from sequences substantially homologous to
one or more of the sequences shown in Figures 13A-13Z Fig. 13 (SEQ
ID NOS: _ to _), or their complementary sequences. By
"substantially homologous", as used throughout the specification and
claims to describe the nucleic acid sequence of the present invention,
is meant a high level of homology between the nucleic acid sequence
and one or more of the sequences of Figures 13A-13Z Fig. 13 (SEQ ID
NOS: _ to _), or its complementary sequence. Preferably, the level
of homology is in excess of 80%, more preferably in excess of 90%,
with a preferred nucleic acid sequence being in excess of 95%
homologous with a portion of one or more of the sequences shown in
Figures 13A-13Z Fig. 13 (SEQ ID NOS: _ to _), or its complement.
The size of such probes is usually at least 20 nucleotides, more
usually from about 200 to 500 nucleotides, and often exceeding 1000
nucleotides.
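
For illustration only, a level of homology of the kind recited above could be estimated for two pre-aligned sequences as a percent identity. The sketch below makes several assumptions that are not drawn from the specification: homology is taken to mean percent identity over aligned columns, columns containing a gap character in either sequence are skipped, and the function name and toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap_chars=".-"):
    """Percent of aligned, non-gap columns at which two sequences agree.

    Assumption: both sequences come from the same alignment and therefore
    have equal length; columns with a gap in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy sequences (hypothetical): nine of ten aligned positions agree.
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(identity, identity > 80.0)
```

Under these assumptions, a candidate probe region would be compared against the 80%, 90% and 95% levels recited above.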

Paragraph beginning at line 28 of page 17 has been amended as
follows: 

The methods for analyzing the RNA for the presence of 
the viruses of this invention include Northern blotting (94), dot and 
slot hybridization, filter hybridization (95), RNase protection (93),
and reverse-transcription polymerase chain reaction (RT-PCR)(96). 
A preferred method is RT-PCR. In this method, the RNA can be 
reverse transcribed to first strand cDNA using a nucleic acid primer 
or primers derived from one or more of the nucleotide sequences 
shown in Figures 13A-13Z (SEQ ID NOS: _ to __). Once the
cDNAs are synthesized, PCR amplification is carried out using pairs of
primers designed to hybridize with sequences in the genomes of one
or more of the non-subtype B HIV-1 viruses of this invention which
are an appropriate distance apart (at least about 50 bases) to permit
amplification of the cDNA and subsequent detection of the 
amplification product. Each primer of a pair is a single-stranded 
nucleic acid of about 20 to about 60 bases in length where one 
primer (the "upstream" primer) is complementary to the original
RNA and the second primer (the "downstream" primer) is
complementary to the first strand of cDNA generated by reverse
transcription of the RNA. The target sequence is generally about
100 to about 300 bases in length but can be as large as 500-1500 
bases or more, e.g., 9,000 bases. Optimization of the amplification 
reaction to obtain sufficiently specific hybridization to the nucleotide 
sequences of these viruses is well within the skill in the art and is 
preferably achieved by adjusting the annealing temperature. 
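
The size constraints recited above can be checked mechanically. The following is a minimal sketch under stated assumptions: primer positions are given as 0-based start coordinates on a single reference strand, the "distance apart" is measured from the end of the upstream primer to the start of the downstream primer, and the function name and coordinate convention are hypothetical rather than taken from the specification.

```python
def check_primer_pair(up_start, up_len, down_start, down_len,
                      min_len=20, max_len=60, min_gap=50):
    """Return (problems, target_length) for a candidate primer pair.

    Defaults follow the text: primers of about 20 to about 60 bases that lie
    at least about 50 bases apart. The target length is counted from the
    start of the upstream primer to the end of the downstream primer.
    """
    problems = []
    for name, length in (("upstream", up_len), ("downstream", down_len)):
        if not min_len <= length <= max_len:
            problems.append(
                f"{name} primer length {length} outside {min_len}-{max_len}"
            )
    gap = down_start - (up_start + up_len)
    if gap < min_gap:
        problems.append(f"primers only {gap} bases apart (need at least {min_gap})")
    target_length = down_start + down_len - up_start
    return problems, target_length


if __name__ == "__main__":
    # Hypothetical coordinates chosen only to exercise the checks.
    print(check_primer_pair(100, 20, 300, 20))
    print(check_primer_pair(100, 15, 130, 20))
```

Such a check covers only primer geometry; hybridization specificity still depends on reaction conditions such as the annealing temperature, as noted above.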

Paragraph beginning at line 27 of page 21 has been amended as
follows: 

The polypeptides of this invention consist of at least 6-12 
amino acids, more preferably at least 13-18 amino acids, even more 
preferably at least 19-24 amino acids and most preferably at least 
25-30 amino acids encoded by, or otherwise derived from, any one 
of the genomic sequences shown in Figures 13A-13Z Fig. 13 (SEQ ID
NOS: _ to _). 

Paragraph beginning at line 9 of page 31 has been amended as 
follows: 

The present invention further relates to computer-
generated alignments of any one or more of the nucleotide sequences
shown in Figures 13A-13Z Fig. 13 (SEQ ID NOS: _ to _). Computer
analysis of the nucleotide sequences, such as the one shown in
Figures 13A-13Z Fig. 13, can be carried out using a commercially
available computer program known to one skilled in the art.

Paragraph beginning at line 14 of page 31 has been amended as 
follows: 

In one embodiment, the sequences shown in Figures
13A-13Z Fig. 13 (SEQ ID NOS: _ to _) are aligned by the computer
program CLUSTAL (67) and adjusted with multiple-aligned sequence
editor (12). The computer analysis results in the distribution of 11 
sequences into various genotypes. Five of these sequences represent 
non-recombinant members of HIV-1 subtypes, and the other six 
sequences represent HIV-1 intersubtype recombinants. 

Paragraph beginning at line 18 of page 32 has been amended as
follows: 

The multiple computer-generated alignments of 
nucleotide sequences are shown in Figures 13A-13Z Fig. 13. The 
multiple computer-generated alignments of encoded amino acid 
sequences are shown in Figures 14-22. These alignments serve to 
highlight regions of homology and non-homology between different 
sequences and hence, can be used by one skilled in the art to design 
oligonucleotides and polypeptides useful as reagents in diagnostic 
assays for HIV-1. 

Paragraph beginning at line 3 of page 41 has been amended as 
follows: 

To determine the phylogenetic relationships of the 
viruses described herein, evolutionary trees from full length gag and 
env sequences were first constructed. This was done to confirm the 
authenticity of previously characterized strains, classify the new 
viruses, and compare viral branching orders in trees from two 
genomic regions. The results confirmed a broad subtype 
representation among the selected viruses (Figures 1A and 1B Fig. 1).
Strains fell into six of the seven major (non-B) clades, including three 
for which full length sequences are not available (i.e., F, G and H). 
However, comparison of the gag and env topologies also identified 
three strains with discordant branching orders. 92RW009.6 grouped 
with subtype C viruses in gag, but with subtype A viruses in env. 
Similarly, 93BR029.4 clustered with subtype B viruses in gag, but
with subtype F viruses in env. 94CY017.41 appeared to cluster
within subtype A viruses in env, but fell into an unknown subtype
in gag. However, characterization of the latter strain is still ongoing.
These different phylogenetic positions were supported by high 
bootstrap values and thus indicated that these strains were 
intersubtype recombinants. 

Paragraph beginning at line 27 of page 43 has been amended as
follows: 

To examine the phylogenetic position of the newly 
derived strains relative to each other and to the reference sequences 
over the entire genome, exploratory tree analyses were performed 
using the same multiple genome alignment generated for the 
diversity plots (Figures 3A-3I Fig. 3). A total of 79 trees were
constructed for overlapping fragments of 500 bp, moved in 100 bp 
increments along the alignment. As expected, four genomes were 
identified that clustered in different subtypes in different parts of 
their genome. These included 93BR029.4 which alternated between 
subtypes F and B, 92RW009.6 which alternated between subtypes A 
and C, and 92NG083.2 and 92NG003.1 which grouped either 
independently or within subtype A. Interestingly, the latter two 
strains exhibited distinct patterns of mosaicism. In trees spanning 
the region 3501-4000, 92NG003.1 clustered within subtype A, while
92NG083.2 clustered independently, presumably representing 
subtype G. In contrast to these strains, there was no evidence for a 
hybrid genome structure in 94IN476.104, 96ZM651.8, 96ZM751.3, 
93BR020.1 or 90CF056.1. These viruses branched consistently in all 
regions analyzed. Based on these findings and the results from the 
diversity plots, it appeared that five of the eleven selected HIV-1 
strains represent non-recombinant reference strains for subtypes C 
(94IN476.104, 96ZM651.8, 96ZM751.3), F (93BR020.1) and H
(90CF056.1), respectively, while at least five are intersubtype 
recombinants. CY017.41 may be recombinant, but work is in progress 
in this regard. 

Paragraph beginning at line 19 of page 44 has been amended as 
follows: 

To map the location of the recombination breakpoints in 
92RW009.6 and 93BR029.4, bootstrap plots and informative site 
analyses were used (18,52,53). Unrooted trees were constructed 
which included U455, 92UG037.1, LAI, MN, OYI, SF2, RF, C2220,
92BR025.8, NDK, ELI, Z2Z6, 93BR020.1 and 90CF056.1; then the
magnitude of the bootstrap values supporting (i) the clustering of
92RW009.6 with members of subtype A (U455, 92UG037.1) or C
(C2220, 92BR025.8), as well as (ii) the clustering of 93BR029.4 with
members of subtype B (LAI, MN, OYI, SF2, RF) or F (93BR020.1) was
determined (in the latter case subtype D viruses were excluded 
because of their known close relationship to subtype B viruses). 
Figures 4A and 4B depict Fig. 4 depicts the results of 797 such
phylogenetic analyses generated for each genome, performed on a 
window of 500 nucleotides moved in steps of 10 nucleotides. Very 
high bootstrap values (>80%) supporting the clustering of 92RW009.6 
with subtype C were apparent in gag, the 3' two-thirds of pol, and 
nef. By contrast, significant branching of 92RW009.6 with subtype A 
was apparent in the gag/pol overlap and the env region. In a small
region (4,000 to 4,200) in the middle of the genome, 92RW009.6 
appeared not to cluster significantly with either subtype, but further 
inspection revealed that this was due to a small number of 
informative sites. These data thus indicated four points of
recombination crossovers between subtypes A and C (Figure Fig. 4A).
A similar analysis identified six recombination breakpoints between
subtypes B and F in 93BR029.4 (Figure Fig. 4B). These included two
more (in gag) than were apparent from the diversity plot analysis
(compare Figures 2A-2J), indicating a greater sensitivity of this
approach.

Paragraph beginning at line 21 of page 46 has been amended as 
follows: 

Because of the lack of a full length subtype G reference 
sequence, recombination breakpoint analysis of 92NG003.1 and 
92NG083.2 required a different approach. The analyses summarized 
in Figures 2A-2J Fig. 2 and Figures 3A-3I Fig. 3 suggested that these
two viruses contained subtype A sequences in the middle of their 
genome. To attempt to confirm this, and to define the extent of these 
putative subtype A fragments, a more detailed diversity plot 
analysis of the viral middle region (between position 3,000 and 
6,000) was performed using different viral strains and varying 
window sizes (ranging from 200 to 400 bp) to examine the extent of 
sequence divergence of 92NG083.2 and 92NG003.1 from members of 
other subtypes, including subtype A. Diversity plots for 92NG003.1 
compared to U455, C2220, NDK and 92NG083.2 and for 92NG083.2 
compared to U455, C2220, NDK and 92NG003.1 depicted 
representative results (using a window size of 300 bp moved in steps 
of 10 bp along the alignment) (data not shown). Similar to the data
shown in Figures 2A-2J Fig. 2, the two "subtype G" viruses are
roughly equidistantly related to members of subtypes A (U455),
C (C2220), and D (NDK), except for two regions in 92NG003.1 and one
region in 92NG083.2 where both viruses are disproportionately more
closely related to U455 than they are to each other. Noting the 
points at which the "G"-A distance increases or decreases relative to 
the others allowed the tentative identification of recombination 
breakpoints. For example, at position 3400, the U455 plot falls
whereas the C2220, NDK and 92NG083.2 plots do not, and around site
3600 the U455 plot crosses the 92NG083.2 plot. Bearing in mind
the window size of 300 nucleotides, this finding suggested that a 
recombination cross-over occurred around position 3500. Similar 
"G"-A plot crossings around positions 3800, 4200 and 5200 (in the 
diversity plot for 92NG003.1), and around positions 4200 and 4800 
(in the diversity plot for 92NG083.2), suggested additional 
recombination breakpoints.

Paragraph beginning at line 15 of page 47 has been amended as
follows: 

Phylogenetic trees were then constructed using the 
regions of sequence defined by these putative breakpoints (Figures
5A-5D). This analysis generally supported the conclusion
drawn from the diversity plots, i.e., 92NG003.1 clustered with 
subtype A viruses in the region between 3501 and 3800, whereas 
92NG083.2 did not; and both 92NG003.1 and 92NG083.2 clustered 
with subtype A viruses in the region between 4201 and 4800. However,
neither the diversity plot nor the tree analysis allowed the definition 
of the boundaries of the subtype A fragments with certainty. 
Nevertheless, the data indicated that (i) both 92NG083.2 and 
92NG003.1 represent G/A recombinants, (ii) that they are the result 
of different recombination events because some of their breakpoints
are clearly different, and (iii) that 92NG083.2 likely encodes a non-
recombinant pol gene. A schematic representation of the mosaic
genomes of 92NG083.2 and 92NG003.1 is shown in Figure Fig. 6.

Paragraph beginning at line 29 of page 47 has been amended as 
follows: 

Having classified the new viruses with respect to their 
subtype assignments, their sequences were examined for clade- 
specific signature sequences. Comparing deduced amino acid 
sequences gene by gene, several subtype specific features were 
found in (Figures 7A-7D Fig. 7). For example, most subtype D viruses
contain an in-frame stop codon in the second exon of tat, which
removes 13 to 16 amino acids from the carboxy terminus of the Tat
protein (Figure Fig. 7A). Similarly, all subtype C viruses (including
94IN476.104, 96ZM651.8 and 96ZM751.3) contain a stop codon in
the second exon of rev which would be predicted to shorten this
protein by 16 amino acids (Figure Fig. 7B). Subtype C viruses also
contain a 15 base pair insertion at the 5' end of the vpu gene (Figure
Fig. 7C) which extends the putative membrane spanning domain of
the Vpu protein by 5 amino acids (data not shown). Although these
changes are unlikely to alter the function of the respective gene
products in a major way (e.g., the known functional domains of both
Tat and Rev proteins are not affected by these changes), it is possible 
that they could influence their mechanism of action in a subtle (but 
nevertheless biologically important) manner. 

Paragraph beginning at line 4 of page 54 has been amended as
follows: 

To map potential recombination breakpoints in this 
remaining region, four recently reported, partial but non-mosaic 
subtype G sequences from Mali which spanned the vif/vpr region 
and thus bridged the "subtype A gap" of 92NG083.2 were used (77).
A set of distance plots that compare 94CY032.3 to one of these newly
derived G sequences (95ML045) as well as representatives of
subtype A (U455), B (MN), and D (ELI), respectively, were
constructed (data not shown). Consistent with the results from the
exploratory tree analysis (Figures 4A and 4B Fig. 4), 94CY032.3 was
disproportionately more closely related to U455 in the 5' end and 3'
thirds of this fragment, suggesting the presence of subtype A-like 
segments. However, in the middle of the fragment, 94CY032.3 was 
clearly equidistant from U455 and the other subtypes, suggesting an 
independent position (diversity plots were generated for a window 
of 300 bp moved in increments of 10 bp). Thus, noting the points at 
which the "A" distance increased and decreased relative to the other 
distances allowed us to tentatively map the two remaining 
breakpoints, one at 4650 and the other at 5000. Trees constructed 
from sequences surrounding these two breakpoints (Figure Fig. 12)
confirmed that 94CY032.3 switched position from subtype A (Figure
Fig. 12; panel 4255-4650) to subtype I (panel 4651-5000), and back
to subtype A (5001-5300; note that the new subtype G sequences
only cover the region between 4255 and 5300). 